Networks and Trees

Susan Holmes and Wolfgang Huber

July 10th 2017

Networks and Trees

Darwin

Darwin

Networks and trees are often used to represent biological data. Phylogenetic trees were used to represent family relationships between species even before Darwin’s famous notebook sketch above.

Networks are often used to schematize complex interactions between proteins involved in diseases as in this Figure:

There are many examples where networks and trees occur in biology including:

Goals for this lecture

This lecture will be focused on ways of integrating graphs into a data analytic workflow.

Graphs

What is a graph and how can it be encoded?

A graph is formed by a set of nodes or vertices: called \(V\)
and a set of edges between these vertices \(E\).

An adjacency matrix \(\mathbf{A}\) is the matrix representation of \(E\). \(\mathbf{A}\) is a square matrix with as many rows as nodes in the graph. \(\mathbf{A}\) contains a non zero entry in the \(i\)th row and \(j\)th column if there is an edge between the \(i\)th and \(j\)th vertices.

library("igraph")
edges1 = matrix(c(1,3,2,3,3,4,4,5,4,6),byrow=TRUE,ncol=2)
edges1
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    3
## [3,]    3    4
## [4,]    4    5
## [5,]    4    6
adj =matrix(0,nrow=6,ncol=6)
edges2=edges1[,c(2,1)]
adj[edges2]=1
adj[edges1]=1
adj
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    0    1    0    0    0
## [2,]    0    0    1    0    0    0
## [3,]    1    1    0    1    0    0
## [4,]    0    0    1    0    1    1
## [5,]    0    0    0    1    0    0
## [6,]    0    0    0    1    0    0
g1 = graph_from_edgelist(edges1,directed=FALSE)
plot(g1, vertex.size=25,edge.width=5, vertex.color="coral")
A small graph

A small graph

Graph Elements:

We will call a weighted, directed graph a network . Networks have adjacency matrices \({\bf A}\) which are \(n\) by \(n\) matrices of positive numbers corresponding to the edge lengths.

AdjacencyMatrix

AdjacencyMatrix

Graph statistics

If the number of edges is similar to the number of nodes, \[\#E\sim O(\#V)\] the graph is sparse .

Some graphs have many nodes, ppiData contains a predicted protein interaction ( ppipred ) graph on about 2500 proteins with 20000 edges, if an adjacency matrix were used to stock the information, 6250000 elements would have to be stored, in fact a list of edges and a number are sufficient and only use 7500 elements.

If a graph is sparse we use encodings similar to those implemented in sparseMatrix that only records the list of edges.

On the other hand, if the number of edges is almost a quadratic function of the nodes (written \(\#E\sim {\Large \mathbf{O}}(\#V^2)\)), our graph is dense and memory space can become an issue for the storage of very large networks.

Our simple graph will often be enriched; becoming a network :

Arrows and directed edges

In directed graphs we differentiate between in-degrees and out-degrees and can identify graphs that have cycles.

Annotation variables on nodes and edges

The strength of a link in a graph can be visualized by the width of the edge. Covariates can be added to nodes. A continuous covariate can be associated to the size of the node in the graphical representation. A categorical value associated to the color.

Graph layout

We will see several examples where the same graph is plotted in different ways, either for aesthetic or practical reasons, this is done through the choice of the graph layout .

Edges represent distances –> 2D representation of the graph = multidimensional scaling we saw in lecture 11 .

fruchterman reingold often appears

Graphs as sparse distances

Graphs can be thought of as ways of simplifying distance matrices by making the smaller distances zero. This is usually done by defining a threshold \(t\) and then making edges between points whose distances are smaller than the threshold \(t\).

library(ggnetwork)
g1df=ggnetwork(g1)
ggf=ggplot(g1df,aes(x=x, y=y, xend=xend, yend=yend))+
 geom_edges()+geom_nodes(aes(x=x, y=y),size=6,color="#8856a7")+
 geom_nodetext(aes(label = vertex.names),size=4,color="white")+
 theme_blank() + theme(legend.position="none")

An example: a four state Markov chain

As we have seen in lecture 2, we can use a Markov chain to summarize transitions between nucleotides (considered the states of the system). This is often schematized by a graph, the igraph package enables many choices for graph ‘decoration’:

library("markovchain")
statesNames = c("A", "C", "G","T")
T1MC = new("markovchain", states = statesNames, transitionMatrix =
          matrix(c(0.2,0.1,0.4,0.3,0,1,0, 0, 0.1,0.2,0.2,0.5,0.1,0.1,0.8,0.0),
          nrow = 4,byrow = TRUE, dimnames=list(statesNames,statesNames)))
plot(T1MC, edge.arrow.size=0.7, vertex.color="purple",edge.arrow.width = 2.2,
      edge.width = 4, edge.color = "blue",edge.curved = TRUE,edge.label.cex=2.5,
      vertex.label.cex = 3.5, edge.loop.angle = 3, vertex.size= 30,
      vertex.label.family="sans", vertex.label.color = "white")
A four state Markov chain with arrows representing possible transitions

A four state Markov chain with arrows representing possible transitions

Markov Chains

Markov chains are idealized models of dynamical systems and the states are represented by the nodes in the graph. The transition matrix gives us the weights on the directed edges (arrows) between the states.

Markov model

Markov model

Graphs with many layers: labels on edges and nodes

Here is an example of plotting a graph downloaded from the string database http:string.org with annotations at the vertices.

This graph shows the perturbed chemokine subnetwork uncovered in Yu et al. (2012) using differential gene expression patterns in sorted Tcells.

Notice the clique -like structure of CXCR3, CXCL13, CCL19, CSCR5 and CCR7 in the right hand corner.

A full perturbed chemokine subnetwork discovered in the study of breast cancer metastasis using GXNA (Nacu et al. 2007) and reported by Yu et al. (2012).

However, understanding the underlying biology requires more than just a laundry list of significant players in a biological system.

image

image

From Gene Set Enrichment to Networks

In lecture 6 , we studied methods for finding a list of differentially expressed genes.

Low power to detect interpretable perturbations.

Methods using pre-defined gene sets (GSEA)

One of the earliest approaches was to look for gene attributes that are overrepresented or enriched in the laundry list of significant genes.

These gene classes are often based on Gene Ontology (GO) categories (for example, genes that are involved in organ growth, or genes that are involved in feeding behavior) are used to find these enriched categories.

The Gene Ontology (GO) is a collection of three ontologies that describe genes and gene products.

Look for the enrichment of a GO term in this list, we will give this term a statistical meaning below.

Gene set analysis with two-way table tests

Yellow Blue Yellow
Significant 25 25 25
Universe 500 100 400

Here, we start by explaining a basic: hypergeometric testing.

Define a universe of candidate genes that may potentially significant; say this universe is of size \(N\).

We also have a record of the genes that actually did come out significant, of which we suppose there were \(m\).

Toy model involving balls in boxes, with a total of \(N\) balls corresponding to the genes identified in the gene universe.

These genes are split into different functional categories, suppose there are \(N=1,000\) genes, of which 500 are yellow, 100 are blue and 400 are red. Then a subset of \(m=75\) genes are labeled as significant , suppose among these significantly interesting genes, there are 25 yellow, 25 red and 25 blue.

Is the blue category enriched or overrepresented?

We use this hypergeometric two-way table testing to account for the fact that some categories are extremely numerous and others are rarer.

By Monte Carlo

universe = c(rep("Yellow",500),rep("Blue",100),rep("Red",400))
countblue=rep(0,20000)
for (i in (1:20000))
{
pick75=sample(universe,75,replace=FALSE)
countblue[i]=length(which(pick75=="Blue"))
}
summary(countblue)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    6.00    7.00    7.51    9.00   19.00
Simulated null distribution

Simulated null distribution

The histogram shows that having a value as large as 25 under the null model would be extremely rare.

Conclude that the blue are enriched.

In the general case, the gene universe is an urn with \(N\) balls, if we pick the \(m\) balls at random and there is a proportion of \(k/N\) blue balls, we expect to see \(k\times m/N\) blue balls in a draw of size \(k\).

Plotting gene enrichment networks with GOplot

Here we show an attractive way of summarizing the connections between the gene functional categories and the significant gene set.

library("GOplot"); data(EC)
circ  =  circle_dat(EC$david, EC$genelist)
chord  =  chord_dat(circ, EC$genes, EC$process)
GOChord(chord, limit = c(0,5))

Correspondence between GO terms and significantly changed genes in a study on differential expression in endothelial cells from two steady state tissues (brain and heart, see Nolan et al. 2013).

In fact, the Gene Ontology graph does not necessarily capture meaningful gene interactions as genes from different processes often interact productively.

Significant subgraphs and high scoring modules

Known skeleton graph onto which we project a significance scores or p-value from our differential expression.

Ideker et al. (2002) and developed further in Nacu et al. (2007) and carefully implemented with many improvements in the BioNet (Beisser et al. 2010);

subgraphs or modules of the scored-skeleton network that seem to be particularly perturbed .

Beisser et al. (2010) models the pvalues of the genes as we did in lecture7 : mixture of ‘non-perturbed‘ genes whose pvalues will be uniformly distributed and non uniformly distributed pvalues from the perturbed genes.

We model the signal in the data using a beta distribution for the p-values.

Given our node-scoring function; we search for connected hotspots in the graph, ie a subgraph of genes with high combined scores.

An example with the BioNet implementation

To illustrate the method, we show data from the BioNet package.

The dataLym contains the relevant pvalues and \(t\) statistics for 3,583 genes, you can access them and do the analysis as follows:

library("BioNet")
library("DLBCL")
data(dataLym)
data(interactome)
interactome
## A graphNEL graph with undirected edges
## Number of Nodes = 9386 
## Number of Edges = 36504
pval=dataLym$t.pval
names(pval)  =  dataLym$label
subnet  =  subNetwork(dataLym$label, interactome)
#We remove the self loops from the graph
subnet  =  rmSelfLoops(subnet)
subnet
## A graphNEL graph with undirected edges
## Number of Nodes = 2559 
## Number of Edges = 7788

Fit a Beta-Uniform model

fb=fitBumModel(pval, plot = FALSE)
fb
## Beta-Uniform-Mixture (BUM) model
## 
## 3583 pvalues fitted
## 
## Mixture parameter (lambda):  0.482
## shape parameter (a):         0.180
## log-likelihood:          4471.8
scores=scoreNodes(subnet, fb, fdr = 0.001)

The qqplot shows the quality of the fit of beta-uniform mixture model to the data.

\(\pi_0\) and a beta distribution (proportional to \(a x^{a - 1}\)) for the p-values corresponding to the alternatives (Pounds and Morris 2003). \[f(x|a,\pi_0)= \pi_0 + (1-\pi_0) a x^{a - 1}\qquad \mbox{ for } 0 <x \leq 1; \; 0<a<1\]1 Running the model with an fdr of 0.001:

Then we run a heuristic search for a high scoring subgraph using:

hotSub  =  runFastHeinz(subnet, scores)
hotSub
## A graphNEL graph with undirected edges
## Number of Nodes = 153 
## Number of Edges = 243
logFC=dataLym$diff
names(logFC)=dataLym$label
par(mfrow=c(1,1))
plotModule(hotSub, scores = scores, diff.expr = logFC)

The functional subgraph that maximizes differential expression between ABC and GCB B-cell lymphoma is colored in red and green, where green shows an upregulation in ACB and red an upregulation in GBC. The shape of the nodes depicts the score: rectangles indicate a negative score, circles a positive score.

Phylogenetic Trees

image

image

Trees are graphs with no cycles. Phylogenetic trees are rooted binary trees with labels at the tips.

Examples

Linnaeus

Linnaeus

Haeckel

Haeckel

Haeckel

xkcd

xkcd

Tree history of languages?

Languages

Languages

The example of HIV

image

image

HIV is a virus that protects itself by evolving very fast, its evolution can be followed in real time, as opposed to the evolution of large organisms which has happened over millions of years.

HIV trees are built for medical purposes such as the detection and understanding of drug resistance. They are built on individual genes, because different genes can show differences in their evolutionary histories and thus produce different gene trees .

The phylogenetic tree shows times when the virus switched from monkeys to humans (Wertheim and Worobey 2009).

Special elements in phylogenies

Markovian Models for evolution.

Molecular clock

Molecular clock

Continuous Markov chain and the generator matrix

We are going to use the Markov chain we saw above (states A,C,G,T)

We consider that the changes of state, i.e. mutations, occur at random times.

\[\begin{equation} Q = \begin{array}{lcccc} & A &T &C & G\\ A&-3\alpha & \alpha &\alpha&\alpha\\ T&\alpha & -3\alpha&\alpha &\alpha\\ C&\alpha & \alpha &-3\alpha&\alpha\\ G&\alpha &\alpha &\alpha &-3\alpha\\ \end{array}\qquad Q= \begin{array}{lcccc} & A &T &C & G\\ A&-\alpha-2\beta & \beta &\beta&\alpha\\ T&\beta & -\alpha-2\beta &\alpha &\beta\\ C&\beta & \alpha &-\alpha-2\beta&\beta\\ G&\alpha &\beta &\beta &-\alpha-2\beta\\ \end{array} (\#eq:JCKimura) \end{equation}\]

Question Why do we say the Kimura model is more flexible ?

The most flexible parametric model is called the Generalized Time Reversible (GTR) model; it has 6 free parameters.

Simulating data and plotting a tree

We know our phylogenetic tree, and simulate the evolution of the nucleotides down this tree.

First we visualize the tree tree1 using ggtree :

library("phangorn")
library("ggtree")
load("/Users/susan/Books/CUBook/data/tree1.RData")
ggtree(tree1,lwd=2,color="coral",alpha=0.8,right=TRUE) +
                   geom_tiplab(size=7,angle=90) +
geom_point(aes(shape=isTip, color=isTip), size=5, alpha=0.6)

Generate some sequences from our tree, each new nucleotide-letter is generated randomly at the root. As it goes down the tree; some mutations may occur.

seqs6=simSeq(tree1,l=60,type="DNA",bf=c(1/8,1/8,3/8,3/8),rate=.1)
seqs6
## 6 sequences with 60 character and 23 different site patterns.
## The states are a c g t
mat6=as.character(seqs6)
mat6df=data.frame(mat6)
p=ggtree(tree1,lwd=1.2)+geom_tiplab(aes(x=branch),size=5,vjust=2)
gheatmap(p,mat6df[,1:60],offset=0.01,colnames=FALSE)

The standard Markovian models of evolution we saw above allows us to improve these estimates.

Estimating a phylogenetic tree

When the true tree-parameter: probabilistic generative models of evolution results in sequences.

There are several approaches to estimation: tree ‘building’ is no exception, here are the main ones:

A nonparametric estimate: the parsimony tree

Parsimony is a nonparametric method that minimizes the number of changes necessary to explain the data, it’s solution is the same as that of the Steiner tree problem.

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    0    1    0    0    0
## [2,]    0    0    1    0    0    0
## [3,]    1    1    0    1    0    0
## [4,]    0    0    1    0    1    1
## [5,]    0    0    0    1    0    0
## [6,]    0    0    0    1    0    0
A parametric estimate: the maximum likelihood tree

In order to estimate the tree using a maximum likelihood or Bayesian approach one needs a model for molecular evolution that integrates mutation rates and branch edge lengths.

ML estimation ( Phyml, FastML,RaxML) use efficient optimization algorithms to maximize the likelihood of a tree under the model assumptions.

Bayesian posterior distributions for trees

Bayesian estimation, MrBayes (Ronquist et al. 2012) or BEAST (Bouckaert et al. 2014) both use MCMC to find posterior distributions of the phylogenies.

Bayesian methods are not directly integrated into R: we can import the collections of trees generated by Monte Carlo methods in order to summarize them and make confidence statements see Chakerian and Holmes (2012) for simple examples.

The semi-parametric approach: distance based methods

These methods, called Neighbor Joining and UPGMA, are quite similar to the hierachical clusering algorithms

Distance estimation steps uses the parametric evolutionary models of Table above , that is the parametric part in semi-parametric.

The Neighbor-joining algorithm itself uses Steiner points as the summary of two combined points, and proceeds iteratively as in hierarchical clustering. It can be quite fast and is often used as a good starting point for the more time-consuming methods.

Let’s start by estimating the tree from the data seqs6 using the nj (neighbor joining) on DNA distances based on the one-parameter Jukes-Cantor model:

tree.nj = nj(dist.ml(seqs6,"JC69"))
ggtree(tree.nj)+geom_tiplab(size=7)+ggplot2::xlim(0, 0.8)

fitJC = pml(tree1, seqs6, k=1)
fitJC = optim.pml(fitJC)
## optimize edge weights:  -392.2323 --> -216.4778 
## optimize edge weights:  -216.4778 --> -216.4778 
## optimize edge weights:  -216.4778 --> -216.4778 
## optimize edge weights:  -216.4778 --> -216.4778
fitJC
## 
##  loglikelihood: -216.4778 
## 
## unconstrained loglikelihood: -147.7867 
## 
## Rate matrix:
##   a c g t
## a 0 1 1 1
## c 1 0 1 1
## g 1 1 0 1
## t 1 1 1 0
## 
## Base frequencies:  
## 0.25 0.25 0.25 0.25

Parsimony

treeMP = pratchet(seqs6)
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
treeMP <- acctran(tree1, seqs6)
set.seed(123)
BStrees = bootstrap.phyDat(seqs6, pratchet, bs = 100)
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 25"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 23"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 39"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 33"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 37"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 21"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 18"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 34"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 30"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 19"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 36"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 29"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 22"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 31"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 35"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 28"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 24"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 26"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 27"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
## [1] "Best pscore so far: 32"
treeMP = plotBS(treeMP, BStrees, "phylogram")
add.scale.bar()

Application to 16S rRNA data

In lecture 5 we saw how to use a probabilistic clustering method to denoise 16S rRNA sequences. We can now reload these denoised sequences and preprocess them before building their phylogeny.

library("dada2")
seqtab = readRDS("/Users/susan/Books/CUBook/data/seqtab.rds")
seqs = getSequences(seqtab)
names(seqs) = seqs

Benefits of using well-studied marker loci 16S rRNA gene taxonomically classify the sequenced variants.

dada2 includes a naive Bayesian classifier method for this purpose (Wang et al. 2007).

fastaRef = "../tmp/rdp_train_set_16.fa.gz"
taxtab = assignTaxonomy(seqtab, refFasta = fastaRef)

We obtain a table of taxonomic information:

taxtab = readRDS(file= "/Users/susan/Books/CUBook/data/taxtab16.rds")
dim(taxtab)
## [1] 268   6
taxtab %>% `rownames<-`(NULL) %>% head
##      Kingdom    Phylum          Class         Order          
## [1,] "Bacteria" "Bacteroidetes" "Bacteroidia" "Bacteroidales"
## [2,] "Bacteria" "Bacteroidetes" "Bacteroidia" "Bacteroidales"
## [3,] "Bacteria" "Bacteroidetes" "Bacteroidia" "Bacteroidales"
## [4,] "Bacteria" "Bacteroidetes" "Bacteroidia" "Bacteroidales"
## [5,] "Bacteria" "Bacteroidetes" "Bacteroidia" "Bacteroidales"
## [6,] "Bacteria" "Bacteroidetes" "Bacteroidia" "Bacteroidales"
##      Family               Genus        
## [1,] "Porphyromonadaceae" NA           
## [2,] "Porphyromonadaceae" NA           
## [3,] "Porphyromonadaceae" NA           
## [4,] "Porphyromonadaceae" "Barnesiella"
## [5,] "Bacteroidaceae"     "Bacteroides"
## [6,] "Porphyromonadaceae" NA

Note: Because the data were randomly generated they are ‘cleaner’ than the real data we will have to handle. In particular the raw sequences will have to be aligned , as in Figure fig:aligned , before applying the tree building algorithms. We perform a multiple-alignment using the DECIPHER package (Wright 2015):

Aligned Malaria

Aligned Malaria

library("DECIPHER")
alignment = AlignSeqs(DNAStringSet(seqs), anchor = NA, verbose = FALSE)

We will use the phangorn package to build the MLE tree (under the GTR model), but will use the neighbor-joining tree as our starting point.

We will use the package to build the MLE tree (under the GTR model), but will use the neighbor-joining tree as our starting point.

phangAlign = phangorn::phyDat(as(alignment, "matrix"), type = "DNA")
dm = phangorn::dist.ml(phangAlign)
treeNJ = phangorn::NJ(dm) # Note, tip order != sequence order
fit = phangorn::pml(treeNJ, data = phangAlign)
fitGTR = update(fit, k = 4, inv = 0.2)
fitGTR = phangorn::optim.pml(fitGTR, model = "GTR", optInv = TRUE, optGamma = TRUE,
      rearrangement = "stochastic", control = phangorn::pml.control(trace = 0))

Combining a phylogenetic trees into a data analysis

DataIntegration

DataIntegration

We now need to combine the phylogenetic tree, the denoised read abundances with the complementary information provided about the samples from which the reads were gathered.

This sample information is often provided as a spreadhseet (or .csv), and (mis)named the meta -data.

This data integration step is facilitated by the specialized containers and accessors that phyloseq provides.

We read in the sample data from csv file:

mimarksPathClean = "/Users/susan/Books/CUBook/data/MIMARKS_Data_clean.csv"
samdf = read.csv(mimarksPathClean, header=TRUE)

Then the full suite of data for this study – the sample-by-sequence feature table, the sample (meta)data, the sequence taxonomies, and the phylogenetic tree – are combined into a single object as follows:

library("phyloseq")
physeq = phyloseq(tax_table(taxtab), sample_data(samdf),
        otu_table(seqtab, taxa_are_rows = FALSE), phy_tree(fitGTR$tree))

This block will only usually be executed once. Once created, we save the object and do all the manipulations starting from that container.

We have already encountered several cases of combining heterogeneous data into special data classes (in particular in lecture 7, when we studied the pasilla data).

phyloseq documentation here.

We can also make data transformations, while maintaining the integrity of the links between all the data components.

Plotting the abundances for certain bacteria can easily be performed using barcharts, ggplot2 commands have been hardwired into one-line calls in the phyloseq package (see labs).

There is even an interactive Shiny-phyloseq browser based tool (McMurdie and Holmes 2015).

Biota

Biota

Microbiome

Microbiome

Hierarchical multiple testing

Hypothesis testing can identify individual bacteria whose abundance relates to sample variables of interest.

A standard approach is very similar to the approach we already visited in lecture 7 .

To integrate this information, Benjamini and Yekutieli (2003) and Benjamini and Bogomolov (2014) proposed a hierarchical testing procedure; where taxonomic groups are only tested if higher levels are found to be be associated.

In the case where many related species have a slight signal, this pooling of information can increase power.

We apply this method to test the association between microbial abundance and age.

Start by using the normalization protocols we discussed in lecture 7
following Love, Huber, and Anders (2014) for RNA-seq data and McMurdie and Holmes (2014) for 16S rRNA generated count data and available in the DESeq2 package:

library("DESeq2")
library("phyloseq")
ps1 = readRDS("/Users/susan/Books/CUBook/data/ps1.rds")
ps_dds = phyloseq_to_deseq2(ps1, ~ ageBin + family_relationship)
gm_mean = function(x, na.rm = TRUE){
  exp(sum(log(x[x > 0]), na.rm = na.rm) / length(x))
}
geoMeans = apply(counts(ps_dds), 1, gm_mean)
ps_dds = estimateSizeFactors(ps_dds, geoMeans = geoMeans)
ps_dds = estimateDispersions(ps_dds)
abund = getVarianceStabilizedData(ps_dds)

We use the structSSI package to perform the hierarchical testing (Sankaran and Holmes 2014).

Hierarchical testing procedure needs univariate tests for each higher-level taxonomic group, not just every taxa.

library("structSSI")
el = phy_tree(ps1)$edge
el0 = el
el0 = el0[rev(seq_len(nrow(el))), ]
el_names = c(rownames(abund), seq_len(phy_tree(ps1)$Nnode))
el[, 1] = el_names[el0[, 1]]
el[, 2] = el_names[as.numeric(el0[, 2])]
unadj_p = treePValues(el, abund, sample_data(ps1)$ageBin)

We can now do our FDR calculations using the hierarchical testing procedure. The test results are guaranteed to control several variants of FDR, but at different levels; we defer details to (Benjamini and Yekutieli 2003; Benjamini and Bogomolov 2014; Sankaran and Holmes 2014). Task Try the following code, including the interactive plotting command that will open a browser window:

hfdr_res = hFDR.adjust(unadj_p, el, 0.75)
summary(hfdr_res)
#plot(hfdr_res, height = 5000) # opens in a browser
tax = tax_table(ps1)[, c("Family", "Genus")] %>% data.frame()
tax$seq = rownames(abund)
hfdr_res@p.vals$seq = rownames(hfdr_res@p.vals)
tax %>%  left_join(hfdr_res@p.vals[,-3]) %>%
  arrange(adjp) %>% head(10)
##                 Family            Genus       seq       unadjp
## 1      Lachnospiraceae             <NA>  GCAAG.96 1.673295e-82
## 2  Erysipelotrichaceae             <NA>  GCGAG.46 1.134371e-79
## 3      Lachnospiraceae        Roseburia  GCAAG.71 3.078334e-75
## 4      Lachnospiraceae Clostridium_XlVa GCAAG.190 8.173451e-59
## 5      Lachnospiraceae             <NA> GCAAG.254 3.227107e-50
## 6      Lachnospiraceae Clostridium_XlVa GCAAG.150 1.056944e-49
## 7      Lachnospiraceae             <NA>  GCAAG.30 9.057568e-49
## 8  Erysipelotrichaceae     Turicibacter   GCAAG.4 2.896917e-48
## 9      Ruminococcaceae     Ruminococcus  GCGAG.78 1.978950e-46
## 10     Lachnospiraceae Clostridium_XlVa GCAAG.170 6.107952e-43
##            adjp
## 1  3.346591e-82
## 2  2.268741e-79
## 3  6.156668e-75
## 4  1.634690e-58
## 5  6.454215e-50
## 6  2.113888e-49
## 7  1.811514e-48
## 8  5.793834e-48
## 9  3.957900e-46
## 10 1.221590e-42

It seems that the most strongly associated bacteria all belong to family Lachnospiraceae.

Minimum Spanning Trees

A very simple and useful graph is the so-called minimum spanning tree ( MST ).

Given distances between vertices; the MST is the tree that spans all the points and has the minimum total length.

Greedy algorithms work well for computing the MST and there are many implementations in : ( mstree in ade4 ), mst in ape , spantree in vegan or mst in igraph .

A spanning tree is a tree that goes through all points at least once, the top graph with red edges is such a tree. The minimum spanning tree is the spanning tree of minimum length; the lower blue graph is the MST.

DNA sequence distances between strains of HIV

Patients all over the world and construct their minimum spanning tree:

The minimum spanning tree computed from DNA distances between HIV sequences from samples taken in 2009 and whose country of origin was known, data as published in the HIVdb database (Rhee et al. 2003).

It could be preferable to use a
graph layout that incorporates the known geographic coordinates.

The input to the miminimum spanning tree algorithm is a distance matrix or a graph with a length edge attribute.

The DNA distances was computed using the Jukes-Cantor mutation model.

Question How could we test similarity between geographic distances and DNA distances: Mantel.

MST based testing: the Friedman–Rafsky test

Graph-based two-sample tests were introduced by Friedman and Rafsky (Friedman and Rafsky 1979) as a generalization of the Wald-Wolfowitz runs test.

Graph vertices associated with covariates.

Test whether the covariate is significantly associated to the graph structure.

The Friedman-Rafsky tests for two/multiple sample segregation on a minimum spanning tree.

It was conceived as a generalization of the univariate Wald-Wolfowitz runs test. If we are comparing two samples, say men and women, whose coordinates represent a measurement of interest.

We color the two groups blue and red as below.

The Wald-Wolfowitz test looks for long runs of the samecolor that would indicate that the two groups have different means.

Instead of looking for consecutive values of one type (’runs’), we count the number of connected nodes of the same type.

Once the minimum spanning tree has been constructed, the vertices are assigned ‘colors’ according to the different levels of a categorical variable. We call pure edges those whose two nodes have the same level of the factor variable. We use \(S_O\), the number of pure edges as our test statistic. To evaluate whether our observed value could have occurred by chance when the groups have the same distributions, we permute the vertix labels (colors) randomly and recount how many pure edges there are. This label swapping is repeated many times, creating our null distribution for \(S\).

Example: Bacteria sharing between mice

ps1  = readRDS("/Users/susan/Books/CUBook/data/ps1.rds")
sampledata = data.frame(sample_data(ps1))
d1 = as.matrix(phyloseq::distance(ps1, method="jaccard"))
gr = graph.adjacency(d1, mode = "undirected", weighted = TRUE)
net = igraph::mst(gr)
V(net)$id = sampledata[names(V(net)), "host_subject_id"]
V(net)$litter = sampledata[names(V(net)), "family_relationship"]
gnet=ggnetwork(net)
ggplot(gnet, aes(x = x, y = y, xend = xend, yend = yend))+
  geom_edges(color = "darkgray") +
  geom_nodes(aes(color = id, shape = litter)) + theme_blank()+
  theme(legend.position="bottom")
MST plot

MST plot

We make a ggnetwork object from the resulting igraph generated minimum spanning tree and then plot it.

Now we compute the null distribution and p-value for the test, this is implemented in the phyloseqGraphTest package:

library("phyloseqGraphTest")
gt = graph_perm_test(ps1, "host_subject_id", distance="jaccard",
                     type="mst",  nperm=1000)
gt$pval
## [1] 0.000999001

We can take a look at the complete histogram of the null distribution generatedby permutation using:

plot_permutations(gt)

Different choices for the skeleton graph

It is not necessary to use an MST for the skeleton graph that defines the edges.

Graphs made by linking nearest neighbors (Schilling 1986)

Distance thresholding work also works well (sometimes called geometric graphs).

phyloseq has functionality for creating graphs based on thresholding a distance matrix through the function make_network.

Create a network by creating a edge between samples whose Jaccard dissimilarity is less than a threshold, which we set below via the parameter max.dist .

Resulting network, shown below that there is grouping of the samples by both mouse and litter.

net = make_network(ps1, max.dist = 0.35)
sampledata = data.frame(sample_data(ps1))
V(net)$id = sampledata[names(V(net)), "host_subject_id"]
V(net)$litter = sampledata[names(V(net)), "family_relationship"]
netg = ggnetwork(net)
ggplot(netg, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "darkgray") +
  geom_nodes(aes(color = id, shape = litter)) + theme_blank()+
    theme(legend.position="bottom")

Sometimes it will preferable to adjust the permutation distribution to account for known structure between the covariates.

Friedman–Rafsy test with nested covariates

individual mice (the host_subject_id variable).

a litter (the family_relationship variable) effect ?

plot_test_network(gtnn1)

We permute the family_relationship labels but keep the host_subject_id structure intact.

gt = graph_perm_test(ps1, "family_relationship",
          grouping = "host_subject_id",
          distance = "jaccard", type = "mst", nperm= 1000)
gt$pval
## [1] 0.001998002
plot_permutations(gt)

gtnn1 = graph_perm_test(ps1, "family_relationship",
                      grouping = "host_subject_id",
                      distance = "jaccard", type = "knn", knn = 1)
gtnn1$pval
## [1] 0.01

The dual graph

Microbial ‘communities’ as they assemble in the microbiome.

Transpose the data.

Note: always prefer Jaccard to correlation networks.

Summary of this lecture

To summarize what you’ve learned in this lecture :

Useful Packages for Incorporating Graphs into an Analysis

A full list packages that deal with graphs and networks is available here: http://www.bioconductor.org/packages/release/BiocViews.html#___GraphAndNetwork

References

Beisser, Daniela, Gunnar W Klau, Thomas Dandekar, Tobias Müller, and Marcus T Dittrich. 2010. “BioNet: An R-Package for the Functional Analysis of Biological Networks.” Bioinformatics 26 (8). Oxford Univ Press: 1129–30.

Benjamini, Yoav, and Marina Bogomolov. 2014. “Selective Inference on Multiple Families of Hypotheses.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76 (1). Wiley Online Library: 297–318.

Benjamini, Yoav, and Daniel Yekutieli. 2003. “Hierarchical Fdr Testing of Trees of Hypotheses.” Technical report, Department of Statistics; Operations Research, Tel Aviv University.

Bouckaert, Remco, Joseph Heled, Denise Kühnert, Tim Vaughan, Chieh-Hsi Wu, Dong Xie, Marc A Suchard, Andrew Rambaut, and Alexei J Drummond. 2014. “BEAST 2: A Software Platform for Bayesian Evolutionary Analysis.” PLoS Comput Biol 10 (4). Public Library of Science: e1003537.

Chakerian, John, and Susan Holmes. 2012. “Computational Tools for Evaluating Phylogenetic and Hierarchical Clustering Trees.” Journal of Computational and Graphical Statistics 21 (3). Taylor & Francis Group: 581–99.

Friedman, Jerome H, and Lawrence C Rafsky. 1979. “Multivariate Generalizations of the Wald-Wolfowitz and Smirnov Two-Sample Tests.” The Annals of Statistics, 697–717.

Ideker, Trey, Owen Ozier, Benno Schwikowski, and Andrew F Siegel. 2002. “Discovering Regulatory and Signalling Circuits in Molecular Interaction Networks.” Bioinformatics 18 Suppl 1 (January): S233–40. http://bioinformatics.oxfordjournals.org/cgi/reprint/18/suppl\_1/S233.

Love, Michael I, Wolfgang Huber, and Simon Anders. 2014. “Moderated Estimation of Fold Change and Dispersion for Rna-Seq Data with Deseq2.” Genome Biology 15 (12). Springer: 1–21.

McMurdie, Paul J, and Susan Holmes. 2014. “Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible.” PLoS Computational Biology 10 (4). Public Library of Science: e1003531.

———. 2015. “Shiny-Phyloseq: Web Application for Interactive Microbiome Analysis with Provenance Tracking.” Bioinformatics 31 (2). Oxford Univ Press: 282–83.

Nacu, Serban, Rebecca Critchley-Thorne, Peter Lee, and Susan Holmes. 2007. “Gene Expression Network Analysis and Applications to Immunology.” Bioinformatics 23 (7, 7): 850–8. doi:10.1093/bioinformatics/btm019.

Pounds, Stan, and Stephan W Morris. 2003. “Estimating the Occurrence of False Positives and False Negatives in Microarray Studies by Approximating and Partitioning the Empirical Distribution of P-Values.” Bioinformatics 19 (10). Oxford Univ Press: 1236–42.

Rhee, Soo-Yon, Matthew J Gonzales, Rami Kantor, Bradley J Betts, Jaideep Ravela, and Robert W Shafer. 2003. “Human Immunodeficiency Virus Reverse Transcriptase and Protease Sequence Database.” Nucleic Acids Research 31 (1). Oxford Univ Press: 298–303.

Ronquist, Fredrik, Maxim Teslenko, Paul van der Mark, Daniel L Ayres, Aaron Darling, Sebastian Höhna, Bret Larget, Liang Liu, Marc A Suchard, and John P Huelsenbeck. 2012. “MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space.” Systematic Biology 61 (3). Oxford University Press: 539–42.

Sankaran, Kris, and Susan Holmes. 2014. “StructSSI: Simultaneous and Selective Inference for Grouped or Hierarchically Structured Data.” Journal of Statistical Software 59 (1): 1–21.

Schilling, Mark F. 1986. “Multivariate Two-Sample Tests Based on Nearest Neighbors.” Journal of the American Statistical Association 81 (395). Taylor & Francis: 799–806.

Wang, Q., G.M. Garrity, J.M. Tiedje, and J.R. Cole. 2007. “Naive Bayesian Classifier for Rapid Assignment of RRNA Sequences into the New Bacterial Taxonomy.” Applied and Environmental Microbiology 73 (16). Am Soc Microbiol: 5261.

Wertheim, Joel O, and Michael Worobey. 2009. “Dating the Age of the Siv Lineages That Gave Rise to Hiv-1 and Hiv-2.” PLoS Computational Biology 5 (5). Public Library of Science: e1000377.

Wright, Erik S. 2015. “DECIPHER: Harnessing Local Sequence Context to Improve Protein Multiple Sequence Alignment.” BMC Bioinformatics 16 (1). BioMed Central: 1.

Yu, Hongxiang, Diana L Simons, Ilana Segall, Valeria Carcamo-Cavazos, Erich J Schwartz, Ning Yan, Neta S Zuckerman, et al. 2012. “PRC2/Eed-Ezh2 Complex Is up-Regulated in Breast Cancer Lymph Node Metastasis Compared to Primary Tumor and Correlates with Tumor Proliferation in Situ.” PloS One 7 (12). Public Library of Science: e51239.


  1. The package actually gives a different name to \(\pi_0\): they use \(\lambda\) and call it the mixing parameter.